fix: mem0 grouped runs lose all groups after the first (telemetry qdrant lock)#27

Merged: groksrc merged 1 commit into main from fix/mem0-telemetry-lock on Jun 12, 2026
@groksrc groksrc commented Jun 12, 2026

Root cause (named by PR #25's error capture)

With MEM0_TELEMETRY on (the default), mem0 opens a qdrant client at a fixed path (~/.mem0/migrations_qdrant) inside every Memory(). qdrant local mode permits one client per path per process, and the grouped runner retains the previous provider (for version_info), so the lock is never released: the first group succeeds and every later group dies with Storage folder ... is already accessed. Matrix v1 lost 24/25 LongMemEval (LME) and 30/30 ConvoMem mem0 groups this way. Isolated repros passed because reassignment let the old client be garbage-collected, which released the lock; that is why the bug was so hard to pin down.
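The failure mode can be illustrated without mem0 or qdrant installed. The sketch below is a stdlib-only simulation, not the real client: FakeLocalClient, ingest_group, and run are hypothetical names, and the per-path lock is modeled with a plain set. It assumes CPython's reference counting, where an unreferenced client is finalized as soon as the function returns.

```python
import gc

FIXED_PATH = "~/.mem0/migrations_qdrant"  # path taken from the PR description
_LOCKED_PATHS = set()  # stand-in for the one-client-per-path lock


class FakeLocalClient:
    """Hypothetical stand-in for a local-mode client that locks its path."""

    def __init__(self, path):
        self.path = None  # set first so __del__ is safe if we raise below
        if path in _LOCKED_PATHS:
            raise RuntimeError(f"Storage folder {path} is already accessed")
        _LOCKED_PATHS.add(path)
        self.path = path

    def close(self):
        # Release the lock only if this instance actually holds it.
        if self.path is not None:
            _LOCKED_PATHS.discard(self.path)
            self.path = None

    def __del__(self):
        self.close()


def ingest_group(retained):
    """Open a client for one group; optionally keep it referenced."""
    client = FakeLocalClient(FIXED_PATH)
    if retained is not None:
        retained.append(client)  # mimics the runner keeping the provider
    return "ok"


def run(groups, retain):
    """Run `groups` sequential ingests and return the outcome of each."""
    retained = [] if retain else None
    outcomes = []
    for _ in range(groups):
        try:
            outcomes.append(ingest_group(retained))
        except RuntimeError:
            outcomes.append("locked")
        gc.collect()  # does not help while a reference is still held
    return outcomes
```

In this model, run(3, retain=False) behaves like the isolated repros (each client is dropped before the next one opens), while run(3, retain=True) behaves like the grouped runner: the first group holds the lock and every later group fails.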

Fix

  • Set MEM0_TELEMETRY=false via setdefault before the deferred mem0 import. Benchmark runs shouldn't emit telemetry anyway, and operators can still force-enable it by setting the variable explicitly.
  • cleanup() explicitly closes the vector_store and _telemetry_vector_store clients and drops the Memory reference, releasing the locks even while the provider instance stays referenced.

Verification

Reproduced the exact failure mode (sequential groups with retained provider references): before the fix, group 2 fails; after it, three sequential groups complete cleanly. Added 2 new unit tests; the suite is green and lint is clean.

🤖 Generated with Claude Code

…ant lock)

Root cause (from PR #25's error capture): mem0 with MEM0_TELEMETRY on
(its default) opens a qdrant client at a FIXED path
(~/.mem0/migrations_qdrant) inside every Memory(). qdrant local mode
allows one client per path per process, and the grouped runner keeps
the previous provider referenced for version_info, so the lock never
releases — first group succeeds, every later group dies with 'Storage
folder ... is already accessed'. Matrix v1 lost 24/25 LongMemEval and
30/30 ConvoMem mem0 groups this way; isolated repros passed because
reassignment let the old client GC.

- MEM0_TELEMETRY defaults to false before the deferred mem0 import
  (benchmark runs shouldn't emit telemetry anyway); operator can still
  force-enable.
- cleanup() explicitly closes vector_store and _telemetry_vector_store
  clients and drops the Memory reference, releasing locks even while
  the instance stays referenced.

Verified under the exact failure conditions: three sequential grouped
ingests with retained provider references, all OK.

Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
Signed-off-by: Drew Cain <groksrc@gmail.com>
@groksrc groksrc merged commit 918c996 into main Jun 12, 2026
1 check passed